Optimal Sequential Multi-Way Number Partitioning

Authors

  • Richard E. Korf
  • Ethan L. Schreiber
  • Michael D. Moffitt
Abstract

Given a multiset of n positive integers, the NP-complete problem of number partitioning is to assign each integer to one of k subsets, such that the largest sum of the integers assigned to any subset is minimized. Last year, three different papers on optimally solving this problem appeared in the literature, two from the first two authors, and one from the third author. We resolve here competing claims of these papers, showing that different algorithms work best for different values of n and k, with orders of magnitude differences in their performance. We combine the best ideas from both approaches into a new algorithm, called sequential number partitioning, and also introduce a hybrid algorithm that achieves the best performance for each value of n and k. Number partitioning is closely related to bin-packing, and advances in either problem can be applied to the other.

Introduction and Overview

Given a multiset of n positive integers, the NP-complete problem of number partitioning is to assign each integer to one of k subsets, so that the largest sum of the integers assigned to any subset is minimized (Garey and Johnson 1979). For example, an optimal two-way partition of the integers {4, 5, 6, 7, 8} is {4, 5, 6} and {7, 8}, since both subsets sum to 15, which minimizes the largest subset sum. This is perhaps the simplest NP-complete problem to describe.

One application of number partitioning is to scheduling. Given a set of n jobs, each with an associated running time, and a set of k identical machines, such as computers or CPU cores, schedule each job on a machine in order to complete all the jobs as soon as possible. The completion time of each machine is the sum of the running times of the jobs assigned to it, while the total completion time is the longest completion time of any machine. Another application is to voting manipulation (Walsh 2009).

We begin with two-way partitioning, then consider multi-way partitioning.
In each case, we describe the relevant algorithms, introduce two new algorithms for multi-way partitioning, and compare their performance experimentally.

Two-Way Number Partitioning

The subset-sum problem is to find a subset of a set of integers whose sum is closest to a given target value. Two-way partitioning is a special case of this problem, where the target value is half the sum of all the integers. We describe five different optimal algorithms for these problems. A sixth algorithm, dynamic programming, is not competitive in either time or space (Korf and Schreiber 2013).

Inclusion-Exclusion (IE)

Perhaps the simplest way to generate all subsets of a given set is to search a binary tree depth-first, where each level corresponds to a different element. Each node includes the element on the left branch, and excludes it on the right branch. The leaves correspond to complete subsets. We first sort the integers, then consider them in decreasing order, searching the tree from left to right. We prune the tree as follows. If the integers included at a given node exceed the sum of the best subset found so far, we prune that node. Similarly, if including all the remaining integers below a node does not generate a subset sum better than the best so far, we prune that node as well.

Complete Greedy Algorithm (CGA)

A similar algorithm that finds better solutions sooner is the complete greedy algorithm (CGA). It also sorts the integers in decreasing order, assigning a different integer at each level, and searches the same tree, but reorders the branches. The left branch of each node assigns the next integer to the subset with the smaller sum so far, and the right branch assigns it to the subset with the larger sum. Thus, the first solution found is that returned by the obvious greedy heuristic for this problem. CGA keeps track of the larger subset sum of the best solution found so far, and prunes a branch when the sum of either subset equals or exceeds this value.
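The CGA search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `cga_two_way` is hypothetical, it returns just the larger subset sum rather than the partition itself, and the early exit on a perfect partition and the skipping of the symmetric branch when both sums are equal are small assumptions added here.

```python
def cga_two_way(numbers):
    """Complete greedy algorithm (sketch) for two-way partitioning.

    Returns the smallest achievable value of the larger subset sum.
    """
    nums = sorted(numbers, reverse=True)   # consider integers in decreasing order
    total = sum(nums)
    perfect = (total + 1) // 2             # larger sum of a perfect partition
    best = total                           # larger subset sum of best solution so far

    def search(i, sum_a, sum_b):
        nonlocal best
        if best == perfect:                # assumption: stop once nothing can improve
            return
        if i == len(nums):                 # leaf: every integer has been assigned
            best = min(best, max(sum_a, sum_b))
            return
        small, large = min(sum_a, sum_b), max(sum_a, sum_b)
        x = nums[i]
        # Left branch: give the next integer to the subset with the smaller sum.
        if small + x < best:
            search(i + 1, small + x, large)
        # Right branch: give it to the subset with the larger sum.
        # When both sums are equal the two branches are mirror images.
        if small != large and large + x < best:
            search(i + 1, small, large + x)

    search(0, 0, 0)
    return best


if __name__ == "__main__":
    print(cga_two_way([4, 5, 6, 7, 8]))    # the example from the introduction
```

On the introductory example {4, 5, 6, 7, 8}, the first leaf reached is the greedy solution with larger sum 17, and the search then improves it to the optimal 15.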
Complete Karmarkar-Karp (CKK)

An even better algorithm is based on a heuristic approximation originally called set differencing (Karmarkar and Karp 1982), but usually referred to as KK. KK sorts the integers in decreasing order, and at each step replaces the two largest integers with their difference. This is equivalent to separating the two largest integers in different subsets, without committing to their final placement. For example, placing 8 and 7 in different subsets is equivalent to placing a 1 in the subset the 8 is assigned to. The difference is then treated as another integer to be assigned. The algorithm continues until only one integer is left, which is the difference between the subset sums of the final partition. Some additional bookkeeping is needed to construct the actual partition. The KK heuristic finds much better solutions than the greedy heuristic.

The Complete Karmarkar-Karp algorithm (CKK) is a complete optimal algorithm (Korf 1998). While KK always places the two largest integers in different subsets, the only other option is to place them in the same subset, by replacing them with their sum. Thus, CKK searches a binary tree where at each node the left branch replaces the two largest integers by their difference, and the right branch replaces them by their sum. The first solution found is the KK solution. If the largest integer equals or exceeds the sum of the remaining integers, they are placed in opposite subsets.

The time complexities of IE, CGA, and CKK are all about $O(2^n)$, where n is the number of integers. Pruning reduces this complexity slightly. Their space complexity is $O(n)$.

Horowitz and Sahni (HS)

Horowitz and Sahni (HS) presented a faster algorithm for the subset sum problem. It divides the n integers into two "half" sets a and c, each of size n/2. Then it generates all $2^{n/2}$ subsets of each half set, including the empty set. The two lists of subsets are sorted by their subset sums.
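As a rough illustration, the following Python sketch builds the two sorted lists of subset sums just described and then walks them with the two pointers that the text goes on to explain. The function names `subset_sums` and `hs_two_way` are hypothetical, and the sketch tracks only sums, not the subsets themselves; it returns the larger subset sum of the best two-way partition.

```python
def subset_sums(half):
    """Return the sorted sums of all subsets of `half`, including the empty set."""
    sums = [0]
    for x in half:
        # Every existing subset either excludes x or includes it.
        sums += [s + x for s in sums]
    sums.sort()
    return sums


def hs_two_way(numbers):
    """Horowitz-Sahni style search (sketch) for the subset sum closest to half."""
    nums = list(numbers)
    total = sum(nums)
    mid = len(nums) // 2
    a_sums = subset_sums(nums[:mid])       # sums from the first half set
    c_sums = subset_sums(nums[mid:])       # sums from the second half set

    i = 0                                  # starts at the empty subset of a
    j = len(c_sums) - 1                    # starts at the complete subset of c
    best = total                           # larger subset sum of best partition so far
    while i < len(a_sums) and j >= 0:
        s = a_sums[i] + c_sums[j]
        best = min(best, max(s, total - s))
        if 2 * s > total:                  # above half the total: shrink the c part
            j -= 1
        elif 2 * s < total:                # below half the total: grow the a part
            i += 1
        else:                              # exactly half: nothing can be better
            break
    return best


if __name__ == "__main__":
    print(hs_two_way([4, 5, 6, 7, 8]))
```

Comparing `2 * s` with `total` avoids fractional targets when the total is odd.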
Any subset of the original integers consists of a subset of the a integers concatenated with a subset of the c integers. Next, it initializes a pointer to the empty subset from the a list, and the complete subset from the c list. If the subset sum pointed to by the a pointer, plus the subset sum pointed to by the c pointer, is more than half the sum of all the integers, the c pointer is decremented to the subset with the next smaller sum. Alternatively, if the sum of the subset sums pointed to by the two pointers is less than half the total sum, the a pointer is incremented to the subset with the next larger sum. If the sum of the two subset sums equals half the total sum, the algorithm terminates. Otherwise, HS continues until either list of subsets is exhausted, returning the best solution found.

HS runs in $O((n/2)2^{n/2})$ time and $O(2^{n/2})$ space. This is much faster than IE, CGA and CKK, but its memory requirement limits it to about 50 integers.

Schroeppel and Shamir (SS)

The Schroeppel and Shamir (1981) algorithm (SS) is based on HS, but uses much less space. HS uses the subsets from the a and c lists in order of their subset sums. Rather than generating, storing, and sorting all these subsets, SS generates them as needed in order of their subset sums.

SS divides the n integers into four sets a, b, c and d, each of size n/4, generates all $2^{n/4}$ subsets of each set, and sorts them in order of their sums. The subsets from the a and b lists are combined in a min heap that generates all subsets of elements from a and b in increasing order of their sums. Each element of the heap consists of a subset from the a list, and a subset from the b list. Initially, it contains all pairs combining the empty set from the a list with each subset from the b list. The top of the heap contains the pair whose subset sum is the current smallest. Whenever a pair $(a_i, b_j)$ is popped off the top of the heap, it is replaced in the heap by a new pair $(a_{i+1}, b_j)$.
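The min heap just described can be sketched with Python's standard `heapq` module. This is a minimal sketch under stated assumptions: the generator name `ascending_pair_sums` is hypothetical, the inputs are lists of subset sums rather than the subsets themselves, and `a_sums` is assumed to be sorted in increasing order and non-empty (it always contains the sum 0 of the empty set).

```python
import heapq


def ascending_pair_sums(a_sums, b_sums):
    """Yield a_sums[i] + b_sums[j] over all pairs, in nondecreasing order.

    The heap never holds more than len(b_sums) entries, which is the source
    of the space saving over storing every combined sum.
    """
    # Initially pair the smallest a entry (the empty set) with every b entry.
    heap = [(a_sums[0] + b, 0, j) for j, b in enumerate(b_sums)]
    heapq.heapify(heap)
    while heap:
        s, i, j = heapq.heappop(heap)      # pair with the current smallest sum
        yield s
        if i + 1 < len(a_sums):
            # Replace the popped pair (a_i, b_j) with (a_{i+1}, b_j).
            heapq.heappush(heap, (a_sums[i + 1] + b_sums[j], i + 1, j))


if __name__ == "__main__":
    a_sums = [0, 1, 4, 5]                  # sorted subset sums of {1, 4}
    b_sums = [0, 2, 8, 10]                 # sorted subset sums of {2, 8}
    print(list(ascending_pair_sums(a_sums, b_sums)))
```

A max heap for the other two lists follows the same pattern with the order reversed, for example by storing negated sums.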
Similarly, the subsets from the c and d lists are combined in a max heap, which returns all subsets from the c and d lists in decreasing order of their sums. SS uses these heaps to generate the subset sums in sorted order, and combines them in the same way as the HS algorithm.

SS runs in time $O((n/4)2^{n/2})$, but only requires $O(2^{n/4})$ space, making it practical for up to about 100 integers.

A recent algorithm reduces this runtime to approximately $O(2^{n/3})$ (Howgrave-Graham and Joux 2010), but is probabilistic, solving only the decision problem for a given subset sum. It cannot prove there is no solution for a given sum, and doesn't return the subset sum closest to a target value.

Efficiently Generating Complement Sets

For every subset generated, there is a complement subset. For efficiency, we do not want to generate both sets from scratch, but generate the complement sum by subtracting the original sum from the total sum. This optimization is obvious for the $O(2^n)$ algorithms described above. For CGA, for example, we only put the largest number in one of the subsets. To implement this for HS or SS, we simply exclude the largest integer when generating the original sets.

Performance of Two-Way Partitioning

Asymptotic Complexity

Our two-way partitioning algorithms fall into two classes: linear space and exponential space. Inclusion-exclusion (IE), the complete greedy algorithm (CGA) and the complete Karmarkar-Karp algorithm (CKK) each run in $O(2^n)$ time, and use only $O(n)$ space. Horowitz and Sahni (HS) runs in $O((n/2)2^{n/2})$ time and uses $O(2^{n/2})$ space, while SS runs in $O((n/4)2^{n/2})$ time and uses $O(2^{n/4})$ space.

Choice of Benchmarks

For the experiments in this paper, we use integers chosen randomly and uniformly from zero to $2^{48} - 1$. The reason for such high-precision integers is to avoid perfect partitions. If the sum of all the integers is divisible by the number of subsets k, then all subset sums are equal in a perfect partition.
Otherwise, the subset sums in a perfect partition differ by at most one. The example at the beginning of this paper is a perfect partition. Once a perfect partition is found, search terminates immediately, since any perfect partition is optimal. This makes problem instances with perfect partitions easier to solve, and those without perfect partitions more difficult. Thus, we use high-precision integers to create hard problems without perfect partitions.
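The perfect-partition condition above comes down to simple arithmetic on the subset sums, which the following Python sketch spells out. The helper names `is_perfect` and `perfect_bound` are hypothetical and introduced only for illustration.

```python
def is_perfect(subset_sums):
    """True if the k subset sums are all equal or differ by at most one."""
    return max(subset_sums) - min(subset_sums) <= 1


def perfect_bound(total, k):
    """Largest subset sum in a perfect k-way partition: total / k rounded up."""
    return -(-total // k)


if __name__ == "__main__":
    # The introductory example: {4, 5, 6} and {7, 8} both sum to 15.
    sums = [4 + 5 + 6, 7 + 8]
    print(is_perfect(sums), perfect_bound(sum(sums), 2))
```

Since no subset sum can be smaller than `perfect_bound(total, k)` in the worst subset, a search can stop as soon as it reaches a partition whose largest sum equals that bound.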


Similar resources

Optimal Multi-Way Number Partitioning

of the Dissertation Optimal Multi-Way Number Partitioning


Search Strategies for Optimal Multi-Way Number Partitioning

The number partitioning problem seeks to divide a set of n numbers across k distinct subsets so as to minimize the sum of the largest partition. In this work, we develop a new optimal algorithm for multi-way number partitioning. A critical observation motivating our methodology is that a globally optimal k-way partition may be recursively constructed by obtaining suboptimal solutions to subprob...


Multi-Way Number Partitioning

The number partitioning problem is to divide a given set of integers into a collection of subsets, so that the sums of the numbers in the subsets are as nearly equal as possible. While a very efficient algorithm exists for optimal two-way partitioning, it is not nearly as effective for multi-way partitioning. We develop two new linear-space algorithms for multi-way partitioning, and demonstrate ...


A Hybrid Recursive Multi-Way Number Partitioning Algorithm

The number partitioning problem is to divide a given set of n positive integers into k subsets, so that the sums of the numbers in the subsets are as nearly equal as possible. While effective algorithms for two-way partitioning exist, multi-way partitioning is much more challenging. We introduce an improved algorithm for optimal multi-way partitioning, by combining several existing algorithms wi...


A distributed multilevel ant-colony algorithm for the multi-way graph partitioning

The graph-partitioning problem arises as a fundamental problem in many important scientific and engineering applications. A variety of optimisation methods are used for solving this problem, and among them the meta-heuristics stand out for their efficiency and robustness. Here, we address the performance of the distributed multilevel ant-colony algorithm (DMACA), a meta-heuristic approach for solvi...


Parallel Multilevel Algorithms for Multi-Constraint Graph Partitioning

Sequential multi-constraint graph partitioning algorithms have been developed to address the load balancing requirements of multi-phase simulations. The efficient execution of large multi-phase simulations on high performance parallel computers requires that the multi-constraint partitionings are computed in parallel. This paper presents a parallel formulation of a recently developed multi-constr...



Publication date: 2014